@nik9000 nik9000 commented Aug 25, 2025

Adds support for "tagged queries" to the `LuceneCountOperator`.
`LuceneCountOperator` is the Lucene native implementation for queries
like:
```
FROM foo
| STATS COUNT(*)
```

This is something we can often implement very quickly using Lucene's
statistics. It's also used for:
```
FROM foo
| WHERE a > 10
| STATS COUNT(*)
```

Here we can't use those statistics directly, but Lucene's queries have
`count` methods that *can* be very fast because they use pre-calculated
statistics. For example, the filter cache stores the number of hits, so
the `count` method runs in O(1) time. This operator needs to use it.
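
A minimal sketch of that fast-path idea, written in plain Java rather than against Lucene's real API. `CountSketch`, `CountableQuery`, `fastCount` and `matches` are hypothetical names invented for illustration; the only thing borrowed from Lucene is the convention, used by `Weight#count`, of returning -1 when a count can't be produced cheaply.

```java
import java.util.function.IntPredicate;

/** Sketch only: a stand-in for a per-segment query, not Lucene's API. */
final class CountSketch {
    /** Hypothetical query that may know its hit count up front, e.g. from a cache. */
    interface CountableQuery {
        /** Returns the hit count if it is known cheaply, or -1 if it is not. */
        int fastCount();

        /** Slow path: does the query match this doc id? */
        boolean matches(int docId);
    }

    /** Counts hits, preferring the cheap count and falling back to a scan of all docs. */
    static int count(CountableQuery query, int maxDoc) {
        int fast = query.fastCount();
        if (fast >= 0) {
            return fast; // no per-document work at all
        }
        int hits = 0;
        for (int doc = 0; doc < maxDoc; doc++) {
            if (query.matches(doc)) {
                hits++;
            }
        }
        return hits;
    }

    /** Builds a query with a known count (cached >= 0) or without one (cached == -1). */
    static CountableQuery query(int cached, IntPredicate predicate) {
        return new CountableQuery() {
            @Override
            public int fastCount() {
                return cached;
            }

            @Override
            public boolean matches(int docId) {
                return predicate.test(docId);
            }
        };
    }
}
```

With this shape, `CountSketch.count(CountSketch.query(-1, d -> d > 10), 100)` has to visit every document, while a query built with a cached count returns immediately.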

"Tagged queries" support means we should be able to use this same
operator for cases like:
```
FROM foo
| STATS COUNT(*) BY DATE_TRUNC(1 DAY, @timestamp)
```
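
One way to picture tagged queries: each bucket becomes its own filter paired with a tag, and the operator emits one (tag, count) row per filter. The sketch below is plain Java with hypothetical names (`TaggedCountSketch`, `TaggedQuery`, `dailyBuckets`, `countByTag`). It assumes UTC day boundaries and scans a list of timestamps for simplicity, where the real operator would presumably ask Lucene for each query's count.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Sketch only: models "one filter per bucket, each carrying a tag". */
final class TaggedCountSketch {
    /** Hypothetical pairing of a filter with the value to emit for the docs it matches. */
    record TaggedQuery(Predicate<Instant> filter, Instant tag) {}

    /** One tagged range filter per UTC day overlapping [from, to), tagged with that day's start. */
    static List<TaggedQuery> dailyBuckets(Instant from, Instant to) {
        List<TaggedQuery> queries = new ArrayList<>();
        Instant start = from.truncatedTo(ChronoUnit.DAYS);
        while (start.isBefore(to)) {
            Instant lo = start;
            Instant hi = start.plus(1, ChronoUnit.DAYS);
            // Half-open range [lo, hi) so a timestamp lands in exactly one bucket.
            queries.add(new TaggedQuery(t -> !t.isBefore(lo) && t.isBefore(hi), lo));
            start = hi;
        }
        return queries;
    }

    /** Counts matches per tag; stands in for running each tagged query's count. */
    static Map<Instant, Long> countByTag(List<TaggedQuery> queries, List<Instant> timestamps) {
        Map<Instant, Long> counts = new LinkedHashMap<>();
        for (TaggedQuery q : queries) {
            counts.put(q.tag(), timestamps.stream().filter(q.filter()).count());
        }
        return counts;
    }
}
```

Because every bucket is an ordinary filter, each one can take whatever cheap counting path is available independently of the others, which is what makes the `BY DATE_TRUNC(...)` case a natural fit for the same operator.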

This PR doesn't plug that into the query planner, but we should be able
to do so afterwards, which would bring us to parity with aggs, à la:
https://www.elastic.co/blog/how-we-made-date-histogram-aggregations-faster-than-ever-in-elasticsearch-7-11
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Aug 25, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)


/** Returns a deep copy of the given block, using the blockFactory for creating the copy block. */
public static Block deepCopyOf(Block block, BlockFactory blockFactory) {
// TODO preserve constants here.
Member Author

I'm going to do this now.

Member

@dnhatn dnhatn left a comment

LGTM, thanks Nik!

}
}

private List<ElementType> tagTypes;
Member

nit: final for tagTypes and tagsToState?

Member Author

👍

}

private Page buildNonConstantBlocksResult() {
BlockUtils.BuilderWrapper[] builders = new BlockUtils.BuilderWrapper[1 + tagTypes.size()];
Member

I think this should be tagTypes.size() instead of 1 + tagTypes.size()?

Member Author

I think so. Checking.

nik9000 commented Aug 26, 2025

Now that #133510 is in I'm going to update the test using it.

@nik9000 nik9000 enabled auto-merge (squash) August 26, 2025 15:53
@nik9000 nik9000 merged commit 935e773 into elastic:main Aug 27, 2025
33 checks passed

Labels

:Analytics/ES|QL (AKA ESQL), >non-issue, Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo)), v9.2.0

3 participants